pca tree
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Merced County > Merced (0.04)
- (7 more...)
The tree autoencoder model, with application to hierarchical data visualization
We propose a new model for dimensionality reduction, the PCA tree, which works like a regular autoencoder, having explicit projection and reconstruction mappings. The projection is effected by a sparse oblique tree, having hard, hyperplane splits using few features and linear leaves. The reconstruction mapping is a set of local linear mappings. Thus, rather than producing a global map as in t-SNE and other methods, which often leads to distortions, it produces a hierarchical set of local PCAs. The use of a sparse oblique tree and PCA makes the overall model interpretable and very fast to project or reconstruct new points. Joint optimization of all the parameters in the tree is a nonconvex nondifferentiable problem. We propose an algorithm that is guaranteed to decrease the error monotonically and which scales to large datasets without any approximation. In experiments, we show PCA trees are able to identify a wealth of low-dimensional and cluster structure in image and document datasets.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Merced County > Merced (0.04)
- (8 more...)
The tree autoencoder model, with application to hierarchical data visualization
We propose a new model for dimensionality reduction, the PCA tree, which works like a regular autoencoder, having explicit projection and reconstruction mappings. The projection is effected by a sparse oblique tree, having hard, hyperplane splits using few features and linear leaves. The reconstruction mapping is a set of local linear mappings. Thus, rather than producing a global map as in t-SNE and other methods, which often leads to distortions, it produces a hierarchical set of local PCAs. The use of a sparse oblique tree and PCA makes the overall model interpretable and very fast to project or reconstruct new points. Joint optimization of all the parameters in the tree is a nonconvex nondifferentiable problem.
Efficient Autotuning of Hyperparameters in Approximate Nearest Neighbor Search
Jääsaari, Elias, Hyvönen, Ville, Roos, Teemu
Approximate nearest neighbor algorithms are used to speed up nearest neighbor search in a wide array of applications. However, current indexing methods feature several hyperparameters that need to be tuned to reach an acceptable accuracy--speed trade-off. A grid search in the parameter space is often impractically slow due to a time-consuming index-building procedure. Therefore, we propose an algorithm for automatically tuning the hyperparameters of indexing methods based on randomized space-partitioning trees. In particular, we present results using randomized k-d trees, random projection trees and randomized PCA trees. The tuning algorithm adds minimal overhead to the index-building process but is able to find the optimal hyperparameters accurately. We demonstrate that the algorithm is significantly faster than existing approaches, and that the indexing methods used are competitive with the state-of-the-art methods in query time while being faster to build.
- Europe > Finland > Uusimaa > Helsinki (0.05)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.84)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.48)
Approximate Principal Direction Trees
McCartin-Lim, Mark, McGregor, Andrew, Wang, Rui
We introduce a new spatial data structure for high dimensional data called the \emph{approximate principal direction tree} (APD tree) that adapts to the intrinsic dimension of the data. Our algorithm ensures vector-quantization accuracy similar to that of computationally-expensive PCA trees with similar time-complexity to that of lower-accuracy RP trees. APD trees use a small number of power-method iterations to find splitting planes for recursively partitioning the data. As such they provide a natural trade-off between the running-time and accuracy achieved by RP and PCA trees. Our theoretical results establish a) strong performance guarantees regardless of the convergence rate of the power-method and b) that $O(\log d)$ iterations suffice to establish the guarantee of PCA trees when the intrinsic dimension is $d$. We demonstrate this trade-off and the efficacy of our data structure on both the CPU and GPU.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- North America > United States > North Carolina (0.04)
- Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)